Overview

Dataset statistics

Number of variables: 21
Number of observations: 41188
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): < 0.1%
Total size in memory: 30.3 MiB
Average record size in memory: 770.4 B

Variable types

NUM: 10
CAT: 10
BOOL: 1

Reproduction

Analysis started: 2020-10-03 17:57:03.035399
Analysis finished: 2020-10-03 17:57:55.402706
Version: pandas-profiling v2.5.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Dataset has 12 (< 0.1%) duplicate rows (Duplicates)
euribor3m is highly correlated with emp.var.rate and 1 other field (High Correlation)
emp.var.rate is highly correlated with euribor3m and 1 other field (High Correlation)
nr.employed is highly correlated with emp.var.rate and 1 other field (High Correlation)
previous has 35563 (86.3%) zeros (Zeros)

Variables

age
Real number (ℝ≥0)

Distinct count: 78
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.02406041
Minimum: 17
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 17
5-th percentile: 26
Q1: 32
Median: 38
Q3: 47
95-th percentile: 58
Maximum: 98
Range: 81
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.42124998
Coefficient of variation (CV): 0.2603746315
Kurtosis: 0.7913115312
Mean: 40.02406041
Median Absolute Deviation (MAD): 8.461535774
Skewness: 0.7846968158
Sum: 1648511
Variance: 108.6024512

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[17. 18.5 20.5 22.5 23.5 ... 83.5 86.5 87.5 88.5 98. ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
31 1947 4.7%
32 1846 4.5%
33 1833 4.5%
36 1780 4.3%
35 1759 4.3%
34 1745 4.2%
30 1714 4.2%
37 1475 3.6%
29 1453 3.5%
39 1432 3.5%
Other values (68) 24204 58.8%

Minimum 5 values

Value Count Frequency (%)
17 5 < 0.1%
18 28 0.1%
19 42 0.1%
20 65 0.2%
21 102 0.2%

Maximum 5 values

Value Count Frequency (%)
98 2 < 0.1%
95 1 < 0.1%
94 1 < 0.1%
92 4 < 0.1%
91 2 < 0.1%
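
Several of the figures reported for age follow directly from the others. The sketch below recomputes them from values copied out of this report. It assumes that "Unique (%)" is the distinct count divided by the number of observations, which is consistent with the numbers shown but is not stated in the report itself.

```python
# Values copied from the report's statistics for `age`.
n_rows = 41188
distinct = 78
mean = 40.02406041
std = 10.42124998
minimum, q1, q3, maximum = 17, 32, 47, 98

# Assumption: "Unique (%)" is distinct values over total observations.
unique_pct = 100 * distinct / n_rows

value_range = maximum - minimum   # "Range"
iqr = q3 - q1                     # "Interquartile range (IQR)"
cv = std / mean                   # "Coefficient of variation (CV)"
variance = std ** 2               # "Variance"

print(f"Unique (%): {unique_pct:.1f}%")
print(f"Range: {value_range}, IQR: {iqr}")
print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.7f}")
```

The same relations explain the very large CV reported for emp.var.rate: its mean is close to zero, so dividing the standard deviation by it inflates the ratio.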
 

job
Categorical

Distinct count: 12
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
admin. 10422 25.3%
blue-collar 9254 22.5%
technician 6743 16.4%
services 3969 9.6%
management 2924 7.1%
retired 1720 4.2%
entrepreneur 1456 3.5%
self-employed 1421 3.5%
housemaid 1060 2.6%
unemployed 1014 2.5%
Other values (2) 1205 2.9%

Length

Max length: 13
Mean length: 8.955229679
Min length: 6

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 22 91.7%
Dash_Punctuation 1 4.2%
Other_Punctuation 1 4.2%

Unicode scripts

Value Count Frequency (%)
Latin 22 91.7%
Common 2 8.3%

Unicode blocks

Value Count Frequency (%)
ASCII 24 100.0%
 

marital
Categorical

Distinct count: 4
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
married 24928 60.5%
single 11568 28.1%
divorced 4612 11.2%
unknown 80 0.2%

Length

Max length: 8
Mean length: 6.831115859
Min length: 6

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 16 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 16 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 16 100.0%
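
The length statistics for a categorical variable can be recovered from its frequency table whenever every value is listed, as is the case for marital. A minimal sketch, using only the counts shown above: the mean length is the count-weighted average of the string lengths.

```python
# Frequency table for `marital`, copied from the report.
counts = {"married": 24928, "single": 11568, "divorced": 4612, "unknown": 80}

total = sum(counts.values())  # number of observations

# Each string length weighted by how often the value occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / total
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(f"Observations: {total}")
print(f"Max length: {max_length}, min length: {min_length}")
print(f"Mean length: {mean_length:.9f}")
```

This check is not possible for job, where two values are folded into "Other values (2)" and their individual lengths and counts are not shown.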
 

education
Categorical

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
university.degree 12168 29.5%
high.school 9515 23.1%
basic.9y 6045 14.7%
professional.course 5243 12.7%
basic.4y 4176 10.1%
basic.6y 2292 5.6%
unknown 1731 4.2%
illiterate 18 < 0.1%

Length

Max length: 19
Mean length: 12.7109595
Min length: 7

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 21 84.0%
Decimal_Number 3 12.0%
Other_Punctuation 1 4.0%

Unicode scripts

Value Count Frequency (%)
Latin 21 84.0%
Common 4 16.0%

Unicode blocks

Value Count Frequency (%)
ASCII 25 100.0%
 

default
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
no 32588 79.1%
unknown 8597 20.9%
yes 3 < 0.1%

Length

Max length: 7
Mean length: 3.043702049
Min length: 2

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 8 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 8 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 8 100.0%
 

housing
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
yes 21576 52.4%
no 18622 45.2%
unknown 990 2.4%

Length

Max length: 7
Mean length: 2.644022531
Min length: 2

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 8 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 8 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 8 100.0%
 

loan
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
no 33950 82.4%
yes 6248 15.2%
unknown 990 2.4%

Length

Max length: 7
Mean length: 2.271875303
Min length: 2

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 8 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 8 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 8 100.0%
 

contact
Categorical

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
cellular 26144 63.5%
telephone 15044 36.5%

Length

Max length: 9
Mean length: 8.365252015
Min length: 8

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 11 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 11 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 11 100.0%
 

month
Categorical

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
may 13769 33.4%
jul 7174 17.4%
aug 6178 15.0%
jun 5318 12.9%
nov 4101 10.0%
apr 2632 6.4%
oct 718 1.7%
sep 570 1.4%
mar 546 1.3%
dec 182 0.4%

Length

Max length: 3
Mean length: 3
Min length: 3

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 17 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 17 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 17 100.0%
 

day_of_week
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
thu 8623 20.9%
mon 8514 20.7%
wed 8134 19.7%
tue 8090 19.6%
fri 7827 19.0%

Length

Max length: 3
Mean length: 3
Min length: 3

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 12 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 12 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 12 100.0%
 

duration
Real number (ℝ≥0)

Distinct count: 1544
Unique (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258.2850102
Minimum: 0
Maximum: 4918
Zeros: 4
Zeros (%): < 0.1%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 36
Q1: 102
Median: 180
Q3: 319
95-th percentile: 752.65
Maximum: 4918
Range: 4918
Interquartile range (IQR): 217

Descriptive statistics

Standard deviation: 259.2792488
Coefficient of variation (CV): 1.003849386
Kurtosis: 20.24793801
Mean: 258.2850102
Median Absolute Deviation (MAD): 171.6661326
Skewness: 3.263141255
Sum: 10638243
Variance: 67225.72888

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 4.500e+00 6.500e+00 2.750e+01 3.350e+01 ... 1.626e+03 2.091e+03 2.501e+03 3.714e+03 4.918e+03], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
85 170 0.4%
90 170 0.4%
136 168 0.4%
73 167 0.4%
124 164 0.4%
87 162 0.4%
72 161 0.4%
104 161 0.4%
111 160 0.4%
106 159 0.4%
Other values (1534) 39546 96.0%

Minimum 5 values

Value Count Frequency (%)
0 4 < 0.1%
1 3 < 0.1%
2 1 < 0.1%
3 3 < 0.1%
4 12 < 0.1%

Maximum 5 values

Value Count Frequency (%)
4918 1 < 0.1%
4199 1 < 0.1%
3785 1 < 0.1%
3643 1 < 0.1%
3631 1 < 0.1%
 

campaign
Real number (ℝ≥0)

Distinct count: 42
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.567592503
Minimum: 1
Maximum: 56
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 7
Maximum: 56
Range: 55
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.770013543
Coefficient of variation (CV): 1.078836903
Kurtosis: 36.97979514
Mean: 2.567592503
Median Absolute Deviation (MAD): 1.634209949
Skewness: 4.762506697
Sum: 105754
Variance: 7.672975028

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 2.5 3.5 4.5 ... 21.5 24.5 31.5 36. 56. ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
1 17642 42.8%
2 10570 25.7%
3 5341 13.0%
4 2651 6.4%
5 1599 3.9%
6 979 2.4%
7 629 1.5%
8 400 1.0%
9 283 0.7%
10 225 0.5%
Other values (32) 869 2.1%

Minimum 5 values

Value Count Frequency (%)
1 17642 42.8%
2 10570 25.7%
3 5341 13.0%
4 2651 6.4%
5 1599 3.9%

Maximum 5 values

Value Count Frequency (%)
56 1 < 0.1%
43 2 < 0.1%
42 2 < 0.1%
41 1 < 0.1%
40 2 < 0.1%
 

pdays
Real number (ℝ≥0)

Distinct count: 27
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 962.475454
Minimum: 0
Maximum: 999
Zeros: 15
Zeros (%): < 0.1%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 999
Q1: 999
Median: 999
Q3: 999
95-th percentile: 999
Maximum: 999
Range: 999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 186.9109073
Coefficient of variation (CV): 0.194198103
Kurtosis: 22.22946263
Mean: 962.475454
Median Absolute Deviation (MAD): 70.3621595
Skewness: -4.922189916
Sum: 39642439
Variance: 34935.68728

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 1.5 2.5 3.5 4.5 ... 15.5 18.5 26.5 513. 999. ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
999 39673 96.3%
3 439 1.1%
6 412 1.0%
4 118 0.3%
9 64 0.2%
2 61 0.1%
7 60 0.1%
12 58 0.1%
10 52 0.1%
5 46 0.1%
Other values (17) 205 0.5%

Minimum 5 values

Value Count Frequency (%)
0 15 < 0.1%
1 26 0.1%
2 61 0.1%
3 439 1.1%
4 118 0.3%

Maximum 5 values

Value Count Frequency (%)
999 39673 96.3%
27 1 < 0.1%
26 1 < 0.1%
25 1 < 0.1%
22 3 < 0.1%
 

previous
Real number (ℝ≥0)

ZEROS
Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1729629989
Minimum: 0
Maximum: 7
Zeros: 35563
Zeros (%): 86.3%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4949010798
Coefficient of variation (CV): 2.861311858
Kurtosis: 20.10881622
Mean: 0.1729629989
Median Absolute Deviation (MAD): 0.2986832636
Skewness: 3.832042243
Sum: 7124
Variance: 0.2449270788

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0. 0.5 1.5 2.5 3.5 4.5 5.5 7. ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
0 35563 86.3%
1 4561 11.1%
2 754 1.8%
3 216 0.5%
4 70 0.2%
5 18 < 0.1%
6 5 < 0.1%
7 1 < 0.1%

Minimum 5 values

Value Count Frequency (%)
0 35563 86.3%
1 4561 11.1%
2 754 1.8%
3 216 0.5%
4 70 0.2%

Maximum 5 values

Value Count Frequency (%)
7 1 < 0.1%
6 5 < 0.1%
5 18 < 0.1%
4 70 0.2%
3 216 0.5%
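
Because previous has only eight distinct values and all of them appear in the frequency table, its summary statistics can be rebuilt from the counts alone. The sketch below does that. It assumes the reported variance uses the sample (n - 1) denominator, which is the pandas default and matches the reported figure.

```python
# Complete frequency table for `previous`, copied from the report.
counts = {0: 35563, 1: 4561, 2: 754, 3: 216, 4: 70, 5: 18, 6: 5, 7: 1}

n = sum(counts.values())                                  # observations
total = sum(value * count for value, count in counts.items())  # "Sum"
mean = total / n                                          # "Mean"
zeros_pct = 100 * counts[0] / n                           # "Zeros (%)"

# Assumption: sample variance with an (n - 1) denominator.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = variance ** 0.5

print(f"n = {n}, sum = {total}, mean = {mean:.10f}")
print(f"zeros = {zeros_pct:.1f}%")
print(f"variance = {variance:.10f}, std = {std:.10f}")
```

With 86.3% of the observations equal to zero, the median and both quartiles are zero as well, which is why the interquartile range is reported as 0.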
 

poutcome
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
nonexistent 35563 86.3%
failure 4252 10.3%
success 1373 3.3%

Length

Max length: 11
Mean length: 10.45372439
Min length: 7

Unicode categories

Value Count Frequency (%)
Lowercase_Letter 13 100.0%

Unicode scripts

Value Count Frequency (%)
Latin 13 100.0%

Unicode blocks

Value Count Frequency (%)
ASCII 13 100.0%
 

emp.var.rate
Real number (ℝ)

HIGH CORRELATION
Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08188550063
Minimum: -3.4
Maximum: 1.4
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: -3.4
5-th percentile: -2.9
Q1: -1.8
Median: 1.1
Q3: 1.4
95-th percentile: 1.4
Maximum: 1.4
Range: 4.8
Interquartile range (IQR): 3.2

Descriptive statistics

Standard deviation: 1.570959741
Coefficient of variation (CV): 19.18483405
Kurtosis: -1.062631525
Mean: 0.08188550063
Median Absolute Deviation (MAD): 1.42283644
Skewness: -0.7240955492
Sum: 3372.7
Variance: 2.467914506

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-3.4 -2.95 -2.35 -1.75 -1.4 -0.65 -0.15 0.5 1.25 1.4 ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
1.4 16234 39.4%
-1.8 9184 22.3%
1.1 7763 18.8%
-0.1 3683 8.9%
-2.9 1663 4.0%
-3.4 1071 2.6%
-1.7 773 1.9%
-1.1 635 1.5%
-3 172 0.4%
-0.2 10 < 0.1%

Minimum 5 values

Value Count Frequency (%)
-3.4 1071 2.6%
-3 172 0.4%
-2.9 1663 4.0%
-1.8 9184 22.3%
-1.7 773 1.9%

Maximum 5 values

Value Count Frequency (%)
1.4 16234 39.4%
1.1 7763 18.8%
-0.1 3683 8.9%
-0.2 10 < 0.1%
-1.1 635 1.5%
 

cons.price.idx
Real number (ℝ≥0)

Distinct count: 26
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93.57566437
Minimum: 92.201
Maximum: 94.767
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 92.201
5-th percentile: 92.713
Q1: 93.075
Median: 93.749
Q3: 93.994
95-th percentile: 94.465
Maximum: 94.767
Range: 2.566
Interquartile range (IQR): 0.919

Descriptive statistics

Standard deviation: 0.578840049
Coefficient of variation (CV): 0.00618579684
Kurtosis: -0.8298085772
Mean: 93.57566437
Median Absolute Deviation (MAD): 0.5098100355
Skewness: -0.2308876514
Sum: 3854194.464
Variance: 0.3350558023

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[92.201 92.29 92.405 92.45 92.559 ... 94.127 94.207 94.34 94.533 94.767], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
93.994 7763 18.8%
93.918 6685 16.2%
92.893 5794 14.1%
93.444 5175 12.6%
94.465 4374 10.6%
93.2 3616 8.8%
93.075 2458 6.0%
92.201 770 1.9%
92.963 715 1.7%
92.431 447 1.1%
Other values (16) 3391 8.2%

Minimum 5 values

Value Count Frequency (%)
92.201 770 1.9%
92.379 267 0.6%
92.431 447 1.1%
92.469 178 0.4%
92.649 357 0.9%

Maximum 5 values

Value Count Frequency (%)
94.767 128 0.3%
94.601 204 0.5%
94.465 4374 10.6%
94.215 311 0.8%
94.199 303 0.7%
 

cons.conf.idx
Real number (ℝ)

Distinct count: 26
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -40.50260027
Minimum: -50.8
Maximum: -26.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: -50.8
5-th percentile: -47.1
Q1: -42.7
Median: -41.8
Q3: -36.4
95-th percentile: -33.6
Maximum: -26.9
Range: 23.9
Interquartile range (IQR): 6.3

Descriptive statistics

Standard deviation: 4.628197856
Coefficient of variation (CV): -0.1142691537
Kurtosis: -0.3585583105
Mean: -40.50260027
Median Absolute Deviation (MAD): 3.938263659
Skewness: 0.3031798587
Sum: -1668221.1
Variance: 21.4202154

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-50.8 -49.75 -48.3 -46.65 -46.05 ... -33.3 -32.2 -29.95 -28.35 -26.9 ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
-36.4 7763 18.8%
-42.7 6685 16.2%
-46.2 5794 14.1%
-36.1 5175 12.6%
-41.8 4374 10.6%
-42 3616 8.8%
-47.1 2458 6.0%
-31.4 770 1.9%
-40.8 715 1.7%
-26.9 447 1.1%
Other values (16) 3391 8.2%

Minimum 5 values

Value Count Frequency (%)
-50.8 128 0.3%
-50 282 0.7%
-49.5 204 0.5%
-47.1 2458 6.0%
-46.2 5794 14.1%

Maximum 5 values

Value Count Frequency (%)
-26.9 447 1.1%
-29.8 267 0.6%
-30.1 357 0.9%
-31.4 770 1.9%
-33 172 0.4%
 

euribor3m
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 316
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.621290813
Minimum: 0.634
Maximum: 5.045
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0.634
5-th percentile: 0.797
Q1: 1.344
Median: 4.857
Q3: 4.961
95-th percentile: 4.966
Maximum: 5.045
Range: 4.411
Interquartile range (IQR): 3.617

Descriptive statistics

Standard deviation: 1.734447405
Coefficient of variation (CV): 0.4789583313
Kurtosis: -1.406802622
Mean: 3.621290813
Median Absolute Deviation (MAD): 1.607967633
Skewness: -0.7091879564
Sum: 149153.726
Variance: 3.0083078

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.634 0.6355 0.641 0.6475 0.6515 ... 4.9635 4.9655 4.969 4.985 5.045 ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
4.857 2868 7.0%
4.962 2613 6.3%
4.963 2487 6.0%
4.961 1902 4.6%
4.856 1210 2.9%
4.964 1175 2.9%
1.405 1169 2.8%
4.965 1071 2.6%
4.864 1044 2.5%
4.96 1013 2.5%
Other values (306) 24636 59.8%

Minimum 5 values

Value Count Frequency (%)
0.634 8 < 0.1%
0.635 43 0.1%
0.636 14 < 0.1%
0.637 6 < 0.1%
0.638 7 < 0.1%

Maximum 5 values

Value Count Frequency (%)
5.045 9 < 0.1%
5 7 < 0.1%
4.97 172 0.4%
4.968 992 2.4%
4.967 643 1.6%
 

nr.employed
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 11
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5167.035911
Minimum: 4963.6
Maximum: 5228.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 4963.6
5-th percentile: 5017.5
Q1: 5099.1
Median: 5191
Q3: 5228.1
95-th percentile: 5228.1
Maximum: 5228.1
Range: 264.5
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 72.25152767
Coefficient of variation (CV): 0.01398316732
Kurtosis: -0.003760375696
Mean: 5167.035911
Median Absolute Deviation (MAD): 62.31807449
Skewness: -1.044262407
Sum: 212819875.1
Variance: 5220.28325

Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[4963.6 5000.15 5013.1 5020.5 5049.85 ... 5137.7 5183.65 5193.4 5211.95 5228.1 ], "bayesian blocks" binning strategy used)

Common values

Value Count Frequency (%)
5228.1 16234 39.4%
5099.1 8534 20.7%
5191 7763 18.8%
5195.8 3683 8.9%
5076.2 1663 4.0%
5017.5 1071 2.6%
4991.6 773 1.9%
5008.7 650 1.6%
4963.6 635 1.5%
5023.5 172 0.4%

Minimum 5 values

Value Count Frequency (%)
4963.6 635 1.5%
4991.6 773 1.9%
5008.7 650 1.6%
5017.5 1071 2.6%
5023.5 172 0.4%

Maximum 5 values

Value Count Frequency (%)
5228.1 16234 39.4%
5195.8 3683 8.9%
5191 7763 18.8%
5176.3 10 < 0.1%
5099.1 8534 20.7%
 

y
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

Common values

Value Count Frequency (%)
no 36548 88.7%
yes 4640 11.3%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
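
As a minimal sketch of that calculation, the function below divides the summed cross-deviations by the square root of the product of the summed squared deviations; the 1/n factors of the covariance and the standard deviations cancel, so they are omitted. The function name and the toy data are illustrative and are not taken from the profiled dataset.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance of x and y over the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (cross-)deviations; the common 1/n factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and rescaling x (location and positive scale) leaves r unchanged.
r_rescaled = pearson_r([3 * v + 7 for v in x], y)

print(r, r_rescaled)
```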

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
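
The sketch below follows that recipe: it replaces each value by its rank (ties receive the average of the ranks they span, a common convention that is assumed here) and then applies the Pearson formula to the ranks. Function names and data are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
cubes = [v ** 3 for v in x]  # monotonic but nonlinear in x

rho = spearman_rho(x, cubes)
r = pearson_r(x, cubes)
print(rho, r)
```

On this toy data ρ reaches 1 because the relationship is perfectly monotonic, while r stays below 1 because it is not linear.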

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
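
A minimal sketch of that pair-counting definition follows. It implements the plain formula from the text, often called tau-a, in which tied pairs count as neither concordant nor discordant; library implementations commonly apply a tie correction (tau-b), so their results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)  # 8 concordant and 2 discordant pairs out of 10
print(tau)
```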

Missing values

Sample

First rows

index  age  job          marital   education            default  housing  loan  contact    month  day_of_week  duration  campaign  pdays  previous  poutcome     emp.var.rate  cons.price.idx  cons.conf.idx  euribor3m  nr.employed  y
0      56   housemaid    married   basic.4y             no       no       no    telephone  may    mon          261       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
1      57   services     married   high.school          unknown  no       no    telephone  may    mon          149       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
2      37   services     married   high.school          no       yes      no    telephone  may    mon          226       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
3      40   admin.       married   basic.6y             no       no       no    telephone  may    mon          151       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
4      56   services     married   high.school          no       no       yes   telephone  may    mon          307       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
5      45   services     married   basic.9y             unknown  no       no    telephone  may    mon          198       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
6      59   admin.       married   professional.course  no       no       no    telephone  may    mon          139       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
7      41   blue-collar  married   unknown              unknown  no       no    telephone  may    mon          217       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
8      24   technician   single    professional.course  no       yes      no    telephone  may    mon          380       1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no
9      25   services     single    high.school          no       yes      no    telephone  may    mon          50        1         999    0         nonexistent  1.1           93.994          -36.4          4.857      5191.0       no

Last rows

index  age  job          marital   education            default  housing  loan  contact    month  day_of_week  duration  campaign  pdays  previous  poutcome     emp.var.rate  cons.price.idx  cons.conf.idx  euribor3m  nr.employed  y
41178  62   retired      married   university.degree    no       no       no    cellular   nov    thu          483       2         6      3         success      -1.1          94.767          -50.8          1.031      4963.6       yes
41179  64   retired      divorced  professional.course  no       yes      no    cellular   nov    fri          151       3         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       no
41180  36   admin.       married   university.degree    no       no       no    cellular   nov    fri          254       2         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       no
41181  37   admin.       married   university.degree    no       yes      no    cellular   nov    fri          281       1         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       yes
41182  29   unemployed   single    basic.4y             no       yes      no    cellular   nov    fri          112       1         9      1         success      -1.1          94.767          -50.8          1.028      4963.6       no
41183  73   retired      married   professional.course  no       yes      no    cellular   nov    fri          334       1         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       yes
41184  46   blue-collar  married   professional.course  no       no       no    cellular   nov    fri          383       1         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       no
41185  56   retired      married   university.degree    no       yes      no    cellular   nov    fri          189       2         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       no
41186  44   technician   married   professional.course  no       no       no    cellular   nov    fri          442       1         999    0         nonexistent  -1.1          94.767          -50.8          1.028      4963.6       yes
41187  74   retired      married   professional.course  no       yes      no    cellular   nov    fri          239       3         999    1         failure      -1.1          94.767          -50.8          1.028      4963.6       no